160 research outputs found
Erratum: “Evaluating computer-aided detection algorithms”
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134826/1/mp5750.pd
Classifier design for computer-aided diagnosis: Effects of finite sample size on the mean performance of classical and neural network classifiers
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/135032/1/mp8805.pd
Combined adaptive enhancement and region-growing segmentation of breast masses on digitized mammograms
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134789/1/mp8658.pd
Is this model reliable for everyone? Testing for strong calibration
In a well-calibrated risk prediction model, the average predicted probability
is close to the true event rate for any given subgroup. Such models are
reliable across heterogeneous populations and satisfy strong notions of
algorithmic fairness. However, auditing a model for strong
calibration is well known to be difficult, particularly for machine learning
(ML) algorithms, due to the sheer number of potential subgroups. As such,
common practice is to only assess calibration with respect to a few predefined
subgroups. Recent developments in goodness-of-fit testing offer potential
solutions but are not designed for settings with weak signal or where the
poorly calibrated subgroup is small, as they either overly subdivide the data
or fail to divide the data at all. We introduce a new testing procedure based
on the following insight: if we can reorder observations by their expected
residuals, there should be a change in the association between the predicted
and observed residuals along this sequence if a poorly calibrated subgroup
exists. This lets us reframe the problem of calibration testing into one of
changepoint detection, for which powerful methods already exist. We begin by
introducing a sample-splitting procedure where a portion of the data is used to
train a suite of candidate models for predicting the residual, and the
remaining data are used to perform a score-based cumulative sum (CUSUM) test.
To further improve power, we then extend this adaptive CUSUM test to
incorporate cross-validation, while maintaining Type I error control under
minimal assumptions. Compared to existing methods, the proposed procedure
consistently achieved higher power in simulation studies and more than doubled
the power when auditing a mortality risk prediction model.
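The sample-splitting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names (`cusum_stat`, `split_cusum_calibration_test`) are hypothetical, a single least-squares residual model stands in for the suite of candidate models, a plain maximum-partial-sum statistic with a simulated null stands in for the score-based CUSUM test, and the cross-validated extension is omitted. It assumes binary outcomes `y`, predicted probabilities `p` strictly between 0 and 1, and a feature matrix `X`.

```python
import numpy as np

def cusum_stat(resid, order, scale):
    """Largest absolute partial sum of the residuals, taken in the given order."""
    return np.max(np.abs(np.cumsum(resid[order]))) / scale

def split_cusum_calibration_test(X, y, p, n_sim=500, seed=0):
    """Illustrative sample-split CUSUM check of calibration (assumed setup)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2:]

    # Stage 1: on the training half, fit a model for the residual y - p.
    # Ordinary least squares is an assumption made to keep the sketch short.
    A_train = np.column_stack([np.ones(len(train)), X[train]])
    beta, *_ = np.linalg.lstsq(A_train, y[train] - p[train], rcond=None)

    # Stage 2: order held-out observations by predicted residual, largest first,
    # then accumulate their observed residuals along that order.
    A_test = np.column_stack([np.ones(len(test)), X[test]])
    order = np.argsort(-(A_test @ beta))
    p_te, y_te = p[test], y[test]
    scale = np.sqrt(np.sum(p_te * (1 - p_te)))
    observed = cusum_stat(y_te - p_te, order, scale)

    # Null distribution: if the model is calibrated, y ~ Bernoulli(p),
    # so outcomes are redrawn from p and the statistic is recomputed.
    sims = np.empty(n_sim)
    for b in range(n_sim):
        y_sim = rng.binomial(1, p_te)
        sims[b] = cusum_stat(y_sim - p_te, order, scale)
    p_value = (1 + np.sum(sims >= observed)) / (n_sim + 1)
    return observed, p_value
```

Calling `split_cusum_calibration_test(X, y, p)` returns the statistic and a Monte Carlo p-value. Because the ordering is learned only from the training half, the held-out outcomes do not influence it, which is what lets the simulated null be compared directly with the observed statistic in this simplified setting.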
Comparison of similarity measures for the task of template matching of masses on serial mammograms
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134879/1/mp1892.pd
Improvement of computerized mass detection on mammograms: Fusion of two-view information
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/135080/1/mp6098.pd
Automated registration of breast lesions in temporal pairs of mammograms for interval change analysis - local affine transformation for improved localization
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134991/1/mp6134.pd
Improvement of mammographic mass characterization using spiculation measures and morphological features
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/135119/1/mp1548.pd
Analysis of temporal changes of mammographic features: Computer-aided classification of malignant and benign breast masses
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/135117/1/mp2242.pd
A new automated method for the segmentation and characterization of breast masses on ultrasound images
Peer Reviewed http://deepblue.lib.umich.edu/bitstream/2027.42/134986/1/mp0069.pd
- …